Data assimilation (DA) refers to methodology for combining a numerical model forecast with observational data to produce an improved estimate for the state of a time-evolving, random process and the parameters that govern its evolution.
My research experience is in developing scalable data assimilation methodology.
I utilize statistical, mathematical, and numerical tools to understand and improve these methods for problems inspired by the high-dimensional, chaotic dynamical systems characteristic of weather and climate.
Setting the gradient \( \nabla_{\pmb{w}} \mathcal{J} \) equal to zero and solving for \( \overline{\pmb{w}} \), we find the critical value as
\[ \begin{align} \overline{\pmb{w}} = \pmb{0} - {\boldsymbol{\Xi}_\mathcal{J}}^{-1} \nabla_{\pmb{w}} \mathcal{J}|_{\pmb{w}=\pmb{0}}, \end{align} \] where \( \boldsymbol{\Xi}_\mathcal{J}:= \nabla^2_{\pmb{w}}\mathcal{J} \) is the Hessian of the cost function.
The forecast mean is updated as,
\[ \begin{align} \overline{\pmb{x}}_L^\mathrm{filt}:= \overline{\pmb{x}}_{L}^\mathrm{fore} + \boldsymbol{\Sigma}_{L}^\mathrm{fore} \overline{\pmb{w}}. \end{align} \]
Defining a right-transform matrix, \( \mathbf{T}:= \boldsymbol{\Xi}^{-\frac{1}{2}}_{\mathcal{J}} \), the update for the covariance is given as
\[ \begin{align} \mathbf{B}^\mathrm{filt}_L = \left(\boldsymbol{\Sigma}_L^\mathrm{fore} \mathbf{T} \right)\left(\boldsymbol{\Sigma}_L^\mathrm{fore} \mathbf{T} \right)^\top. \end{align} \]
This derivation sketches the square root Kalman filter,3 written for the perfect, linear-Gaussian model.
Under the perfect model assumption, the forecast is furthermore written
\[ \begin{align} \overline{\pmb{x}}_{L+1}^\mathrm{fore}:= \mathbf{M}_{L+1} \overline{\pmb{x}}_{L}^{\mathrm{filt}}; & & \boldsymbol{\Sigma}_{L+1}^\mathrm{fore} := \mathbf{M}_{L+1}\boldsymbol{\Sigma}_{L}^\mathrm{filt}; \end{align} \] where \( \boldsymbol{\Sigma}_{L}^\mathrm{filt} := \boldsymbol{\Sigma}_L^\mathrm{fore} \mathbf{T} \), giving a complete recursion in time in terms of the matrix factor.
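The full recursion above can be sketched in a few lines of linear algebra. This is a minimal numerical illustration, not an operational implementation; the weight-space cost function is assumed to take the standard form \( \mathcal{J}(\pmb{w}) = \frac{1}{2}\parallel\pmb{w}\parallel^2 + \frac{1}{2}\parallel\hat{\pmb{\delta}}_L - \mathbf{S}_L\pmb{w}\parallel^2 \) with \( \mathbf{S}_L = \mathbf{R}_L^{-\frac{1}{2}}\mathbf{H}_L\boldsymbol{\Sigma}_L^\mathrm{fore} \), and the function name is hypothetical:

```python
import numpy as np

def sqrt_kf_cycle(x_fore, Sig_fore, y, H, R, M):
    """One analysis + forecast cycle of the square root Kalman filter.
    Sig_fore is a square root of the background covariance, B = Sig Sig^T."""
    # whiten with an inverse Cholesky factor, standing in for R^{-1/2}
    R_inv_sqrt = np.linalg.inv(np.linalg.cholesky(R))
    delta = R_inv_sqrt @ (y - H @ x_fore)     # whitened innovation
    S = R_inv_sqrt @ H @ Sig_fore             # whitened observed factor
    Xi = np.eye(S.shape[1]) + S.T @ S         # Hessian of the weight-space cost
    w_bar = np.linalg.solve(Xi, S.T @ delta)  # critical point, a Newton step from w = 0
    x_filt = x_fore + Sig_fore @ w_bar        # updated mean
    vals, vecs = np.linalg.eigh(Xi)           # symmetric square root T = Xi^{-1/2}
    T = vecs @ np.diag(vals ** -0.5) @ vecs.T
    Sig_filt = Sig_fore @ T                   # updated square root factor
    return M @ x_filt, M @ Sig_filt           # perfect-model forecast to the next time
```

Setting `M` to the identity returns the filtered quantities themselves, and the filtered covariance \( \mathbf{B}^\mathrm{filt}_L \) is recovered as `Sig_filt @ Sig_filt.T`.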
The square root Kalman filter analysis is written entirely in terms of the weight vector \( \overline{\pmb{w}} \) and the right-transform matrix \( \mathbf{T} \).
This is core to the efficiency of the ensemble transform Kalman filter (ETKF),4 which uses a reduced-rank approximation to the background \( \mathbf{B}^\mathrm{i}_L \).
Given an ensemble matrix \( \mathbf{E}^\mathrm{i}_L \in \mathbb{R}^{N_x \times N_e} \), writing the cost function restricted to the ensemble span gives a restricted Hessian \( \boldsymbol{\Xi}_{\widetilde{\mathcal{J}}} \) of reduced dimension,
\[ \begin{align} \boldsymbol{\Xi}_\mathcal{\widetilde{J}} \in \mathbb{R}^{N_e \times N_e}. \end{align} \]
Using ensemble-based, empirical estimates,
\[ \begin{align} & & \hat{\pmb{x}}_L^\mathrm{fore} &= \mathbf{E}_L^\mathrm{fore} \pmb{1} / N_e ; & \hat{\pmb{\delta}}_L &= \mathbf{R}_L^{-\frac{1}{2}}\left(\pmb{y}_L - \mathbf{H}_L \hat{\pmb{x}}_L^\mathrm{fore}\right)\\ &&\mathbf{X}_L^\mathrm{fore} &= \mathbf{E}_L^\mathrm{fore} - \hat{\pmb{x}}^\mathrm{fore}_L \pmb{1}^\top ; & \mathbf{P}^\mathrm{fore}_L &= \mathbf{X}_L^\mathrm{fore} \left(\mathbf{X}_L^\mathrm{fore}\right)^\top / \left(N_e - 1\right);\\ & &\mathbf{S}_L &:=\mathbf{R}_L^{-\frac{1}{2}}\mathbf{H}_L \mathbf{X}_L^\mathrm{fore}; \end{align} \]
the ensemble-based cost function is written as
\[ \begin{alignat}{2} & & {\color{#d95f02} {\widetilde{\mathcal{J}} (\pmb{w})} } &= {\color{#1b9e77} {\frac{1}{2} \parallel \hat{\pmb{x}}_L^\mathrm{fore} - \mathbf{X}^\mathrm{fore}_L \pmb{w}- \hat{\pmb{x}}^\mathrm{fore}_L \parallel_{\mathbf{P}^\mathrm{fore}_L}^2} } + {\color{#7570b3} {\frac{1}{2} \parallel \pmb{y}_L - \mathbf{H}_L\hat{\pmb{x}}^\mathrm{fore}_L - \mathbf{H}_L \mathbf{X}^\mathrm{fore}_L \pmb{w} \parallel_{\mathbf{R}_L}^2 } }\\ \Leftrightarrow& & {\color{#d95f02} {\widetilde{\mathcal{J}}(\pmb{w})} } &= {\color{#1b9e77} {\frac{1}{2}(N_e - 1) \parallel\pmb{w}\parallel^2} } + {\color{#7570b3} {\frac{1}{2}\parallel \hat{\pmb{\delta}}_L - \mathbf{S}_L \pmb{w}\parallel^2 } } \end{alignat} \] which is an optimization over a weight vector \( \pmb{w} \) in the ensemble dimension \( N_e \) rather than in the state space dimension \( N_x \).
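Following the same steps as in the derivation above, the gradient and Hessian of the ensemble-based cost function can be written explicitly (a sketch of the standard ETKF solution, stated here for completeness):

\[ \begin{align} \nabla_{\pmb{w}} \widetilde{\mathcal{J}} = (N_e - 1)\pmb{w} - \mathbf{S}_L^\top\left(\hat{\pmb{\delta}}_L - \mathbf{S}_L \pmb{w}\right); & & \boldsymbol{\Xi}_{\widetilde{\mathcal{J}}} = (N_e - 1)\mathbf{I} + \mathbf{S}_L^\top \mathbf{S}_L; \end{align} \]

so that \( \overline{\pmb{w}} = \boldsymbol{\Xi}_{\widetilde{\mathcal{J}}}^{-1}\mathbf{S}_L^\top \hat{\pmb{\delta}}_L \) and the right-transform \( \mathbf{T} = \boldsymbol{\Xi}_{\widetilde{\mathcal{J}}}^{-\frac{1}{2}} \) are now computed in the ensemble dimension \( N_e \).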
This is a key reduction that makes Monte Carlo methods feasible for the large dimension of geophysical models.
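The corresponding analysis step can be sketched numerically. This is a minimal illustration of the weight-space update, omitting the inflation and localization that practical ETKF implementations require; the function name and interface are hypothetical:

```python
import numpy as np

def etkf_analysis(E, y, H, R):
    """ETKF analysis sketch: update a forecast ensemble E (N_x x N_e) with
    observation y, linear observation operator H, and obs covariance R."""
    N_e = E.shape[1]
    x_hat = E.mean(axis=1)                    # ensemble mean
    X = E - x_hat[:, None]                    # anomaly matrix
    R_inv_sqrt = np.linalg.inv(np.linalg.cholesky(R))
    delta = R_inv_sqrt @ (y - H @ x_hat)      # whitened innovation
    S = R_inv_sqrt @ H @ X                    # whitened observed anomalies
    Xi = (N_e - 1) * np.eye(N_e) + S.T @ S    # restricted Hessian
    w_bar = np.linalg.solve(Xi, S.T @ delta)  # optimal weights
    vals, vecs = np.linalg.eigh(Xi)           # symmetric square root T = Xi^{-1/2}
    T = vecs @ np.diag(vals ** -0.5) @ vecs.T
    x_filt = x_hat + X @ w_bar                # updated mean
    X_filt = X @ T * np.sqrt(N_e - 1)         # updated anomalies
    return x_filt[:, None] + X_filt
```

The symmetric choice of square root for \( \mathbf{T} \) keeps the updated anomalies mean-zero, so the returned ensemble has mean \( \hat{\pmb{x}}_L^\mathrm{fore} + \mathbf{X}_L^\mathrm{fore}\overline{\pmb{w}} \).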
One alternative to constructing the tangent-linear and adjoint models is to perform a hybrid analysis based on the ETKF.
This approach is at the basis of the iterative ensemble Kalman filter / smoother (IEnKF/S).11
This technique seeks to perform an ensemble analysis like the square root ETKF by defining the ensemble estimates and the weight vector directly in the ensemble span:
\[ \begin{alignat}{2} & & {\color{#d95f02} {\widetilde{\mathcal{J}} (\pmb{w})} } &= {\color{#d95f02} {\frac{1}{2} \parallel \hat{\pmb{x}}_{0|L-S}^\mathrm{smth} - \mathbf{X}^\mathrm{smth}_{0|L-S} \pmb{w}- \hat{\pmb{x}}^\mathrm{smth}_{0|L-S} \parallel_{\mathbf{P}^\mathrm{smth}_{0|L-S}}^2} } + {\color{#7570b3} {\sum_{k=L-S+1}^L \frac{1}{2} \parallel \pmb{y}_k - \mathcal{H}_k\circ {\color{#1b9e77} { \mathcal{M}_{k:1}\left( {\color{#d95f02} { \hat{\pmb{x}}^\mathrm{smth}_{0|L-S} + \mathbf{X}^\mathrm{smth}_{0|L-S} \pmb{w} } } \right)}}\parallel_{\mathbf{R}_k}^2 } }\\ \Leftrightarrow & & {\color{#d95f02} {\widetilde{\mathcal{J}} (\pmb{w})} } &= {\color{#d95f02} {(N_e - 1) \frac{1}{2} \parallel \pmb{w}\parallel^2} } + {\color{#7570b3} {\sum_{k=L-S+1}^L \frac{1}{2} \parallel \pmb{y}_k - \mathcal{H}_k\circ {\color{#1b9e77} { \mathcal{M}_{k:1}\left( {\color{#d95f02} { \hat{\pmb{x}}^\mathrm{smth}_{0|L-S} + \mathbf{X}^\mathrm{smth}_{0|L-S} \pmb{w} } } \right)}}\parallel_{\mathbf{R}_k}^2 } }. \end{alignat} \]
The scheme produces an iterative estimate using a Gauss-Newton-11 or, e.g., Levenberg-Marquardt-based12 optimization.
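The Gauss-Newton iteration can likewise be sketched, here with a "bundle"-style finite-difference approximation of the ensemble sensitivities in place of tangent-linear models. The interface (`model`, `obs_op`), the fixed iteration budget, and the function name are assumptions for illustration, and safeguards such as line search, inflation, and localization are omitted:

```python
import numpy as np

def ienks_gauss_newton(E0, obs_list, model, obs_op, R, n_iter=10, eps=1e-4):
    """Gauss-Newton sketch of an iterative ensemble smoother analysis:
    optimize the weight vector w for the initial condition against a window
    of observations. model(E) propagates an ensemble one observation time;
    obs_op maps an ensemble into observation space."""
    N_e = E0.shape[1]
    x_hat = E0.mean(axis=1)               # smoother ensemble mean
    X = E0 - x_hat[:, None]               # smoother anomalies
    R_inv_sqrt = np.linalg.inv(np.linalg.cholesky(R))
    w = np.zeros(N_e)
    for _ in range(n_iter):
        # bundle: small multiple of the anomalies about the current iterate
        E = (x_hat + X @ w)[:, None] + eps * X
        grad = (N_e - 1) * w              # gradient of the prior term
        Hess = (N_e - 1) * np.eye(N_e)    # Gauss-Newton Hessian accumulator
        for y in obs_list:
            E = model(E)                  # propagate through the window
            Y = obs_op(E)
            y_mean = Y.mean(axis=1)
            S = R_inv_sqrt @ (Y - y_mean[:, None]) / eps  # ensemble sensitivities
            delta = R_inv_sqrt @ (y - y_mean)
            grad -= S.T @ delta
            Hess += S.T @ S
        dw = np.linalg.solve(Hess, grad)  # Gauss-Newton step
        w = w - dw
        if np.linalg.norm(dw) < 1e-10:
            break
    return x_hat + X @ w, w
```

For a linear model and observation operator the cost function is quadratic, so the iteration converges in a single Gauss-Newton step and reproduces the ensemble Kalman analysis of the initial condition.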